Improv Chat: Second Response Generation for Chatbot
Existing research on response generation for chatbots focuses on \textbf{First
Response Generation}, which aims to teach the chatbot to say the first response
(e.g. a sentence) appropriate to the conversation context (e.g. the user's
query). In this paper, we introduce a new task, \textbf{Second Response
Generation}, termed Improv chat, which aims to teach the chatbot to say a
second response after its first response, with respect to the conversation
context, so as to lighten the burden on the user to keep the conversation
going. Specifically, we propose a general learning-based framework and develop
a retrieval-based system that generates second responses with the user's query
and the chatbot's first response as input. We present the approach to building
the conversation corpus for Improv chat from public forums and social networks,
as well as the neural-network-based models for response matching and ranking.
We include preliminary experiments and results in this paper. This work could
be further advanced with better deep matching models for retrieval-based
systems or generative models for generation-based systems, as well as extensive
evaluations in real-life applications.
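To make the retrieval-based setup concrete, here is a minimal sketch of second response ranking, with a toy bag-of-words `encode` function standing in for the paper's neural matching model; the cosine scoring is an illustrative assumption, not the system described above.

```python
from collections import Counter
import math

def encode(text):
    # Toy bag-of-words encoder; a stand-in for the neural matching model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(v * b[t] for t, v in a.items() if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_second_responses(query, first_response, candidates):
    # The matching context is the user's query plus the chatbot's first response.
    context = encode(query + " " + first_response)
    return sorted(candidates, key=lambda c: cosine(encode(c), context), reverse=True)
```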
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization
Neural extractive summarization models usually employ a hierarchical encoder
for document encoding, and they are trained using sentence-level labels, which
are created heuristically using rule-based methods. Training the hierarchical
encoder with these \emph{inaccurate} labels is challenging. Inspired by recent
work on pre-training transformer sentence encoders \cite{devlin:2018:arxiv}, we
propose {\sc Hibert} (as shorthand for {\bf HI}erarchical {\bf B}idirectional
{\bf E}ncoder {\bf R}epresentations from {\bf T}ransformers) for document
encoding, together with a method to pre-train it using unlabeled data. We apply
the pre-trained {\sc Hibert} to our summarization model, and it outperforms its
randomly initialized counterpart by 1.25 ROUGE on the CNN/Dailymail dataset and
by 2.0 ROUGE on a version of the New York Times dataset. We also achieve
state-of-the-art performance on these two datasets.
Comment: to appear in ACL 2019
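For illustration, a minimal PyTorch sketch of a hierarchical document encoder in this spirit: a sentence-level transformer pools tokens into sentence vectors, and a document-level transformer contextualizes them. The pooling strategy, dimensions, and layer counts are assumptions for brevity, not the paper's configuration.

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    def __init__(self, vocab_size, d_model=256, nhead=4, nlayers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        sent_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.sent_enc = nn.TransformerEncoder(sent_layer, nlayers)  # tokens -> sentence
        doc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.doc_enc = nn.TransformerEncoder(doc_layer, nlayers)    # sentences -> document

    def forward(self, docs):
        # docs: (batch, n_sents, n_tokens) token ids
        b, s, t = docs.shape
        tok = self.embed(docs.view(b * s, t))
        sent_states = self.sent_enc(tok)
        sent_vecs = sent_states[:, 0].view(b, s, -1)  # first-token pooling per sentence
        return self.doc_enc(sent_vecs)                # context-aware sentence representations
```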
Formality Style Transfer with Hybrid Textual Annotations
Formality style transfer is the task of modifying the formality of a given
sentence without changing its content. Its main challenge is the lack of
large-scale sentence-aligned parallel data. In this paper, we propose an
omnivorous model that consumes parallel data and formality-classified data
jointly to alleviate the data sparsity issue. We empirically demonstrate the
effectiveness of our approach by achieving state-of-the-art performance on a
recently proposed benchmark dataset for formality transfer. Furthermore, our
model can be readily adapted to other unsupervised text style transfer tasks,
such as unsupervised sentiment transfer, and achieves competitive results on
three widely recognized benchmarks.
Selective Encoding for Abstractive Sentence Summarization
We propose a selective encoding model that extends the sequence-to-sequence
framework for abstractive sentence summarization. It consists of a sentence
encoder, a selective gate network, and an attention-equipped decoder. The
sentence encoder and decoder are built with recurrent neural networks. The
selective gate network constructs a second-level sentence representation by
controlling the information flow from encoder to decoder. This second-level
representation is tailored to the sentence summarization task, which leads to
better performance. We evaluate our model on the English Gigaword, DUC 2004,
and MSR abstractive sentence summarization datasets. The experimental results
show that the proposed selective encoding model outperforms the
state-of-the-art baseline models.
Comment: 10 pages; to appear in ACL 2017
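As a concrete illustration, a minimal PyTorch sketch of such a selective gate: each encoder state is filtered by a sigmoid gate computed from that state and a whole-sentence vector. The exact parameterization is a plausible reading of the description above, not a verified reproduction.

```python
import torch
import torch.nn as nn

class SelectiveGate(nn.Module):
    def __init__(self, hidden):
        super().__init__()
        self.w_h = nn.Linear(hidden, hidden, bias=False)
        self.w_s = nn.Linear(hidden, hidden, bias=True)

    def forward(self, enc_states, sent_vec):
        # enc_states: (batch, len, hidden); sent_vec: (batch, hidden)
        gate = torch.sigmoid(self.w_h(enc_states) + self.w_s(sent_vec).unsqueeze(1))
        return enc_states * gate  # second-level representation fed to the decoder
```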
Unsupervised Machine Commenting with Neural Variational Topic Model
Article comments can provide supplementary opinions and facts for readers,
thereby increasing the attractiveness and engagement of articles. Automatic
commenting is therefore helpful for improving the activeness of communities
such as online forums and news websites. Previous work shows that training an
automatic commenting system requires large parallel corpora. Although some
articles are naturally paired with comments on certain websites, most articles
and comments on the Internet are unpaired. To fully exploit the unpaired data,
we completely remove the need for parallel data and propose a novel
unsupervised approach to training an automatic article commenting model,
relying on nothing but unpaired articles and comments. Our model is based on a
retrieval-based commenting framework, which uses news articles to retrieve
comments based on the similarity of their topics. The topic representation is
obtained from a neural variational topic model, which is trained in an
unsupervised manner. We evaluate our model on a news comment dataset.
Experiments show that our proposed topic-based approach significantly
outperforms previous lexicon-based models. The model also profits from paired
corpora and achieves state-of-the-art performance in semi-supervised scenarios.
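To make the retrieval step concrete, a minimal sketch assuming each article and comment has already been mapped to a topic vector by the trained topic model; ranking by cosine similarity in topic space is an illustrative choice.

```python
import numpy as np

def retrieve_comments(article_topics, comment_topics, k=5):
    # article_topics: (T,) topic vector; comment_topics: (N, T) candidate comments.
    a = article_topics / np.linalg.norm(article_topics)
    c = comment_topics / np.linalg.norm(comment_topics, axis=1, keepdims=True)
    scores = c @ a                  # cosine similarity in topic space
    return np.argsort(-scores)[:k]  # indices of the top-k comments
```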
Dictionary-Guided Editing Networks for Paraphrase Generation
An intuitive way for a human to write a paraphrase is to replace words or
phrases in the original sentence with their synonyms and make the changes
necessary to ensure the new sentence is fluent and grammatically correct. We
propose a novel approach to modeling this process with dictionary-guided
editing networks, which effectively rewrite the source sentence to generate
paraphrases. The model jointly learns to select appropriate word-level and
phrase-level paraphrase pairs from an off-the-shelf dictionary in the context
of the original sentence, and to generate fluent natural language sentences.
Specifically, the system retrieves a set of word-level and phrase-level
paraphrase pairs derived from the Paraphrase Database (PPDB) for the original
sentence, which are used to guide the decision of which words should be deleted
or inserted, via a soft attention mechanism under the sequence-to-sequence
framework. We conduct experiments on two benchmark datasets for paraphrase
generation, namely MSCOCO and Quora. The evaluation results demonstrate that
our dictionary-guided editing networks outperform the baseline methods.
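As an illustration of the retrieval step, a minimal sketch with a toy in-memory dictionary standing in for PPDB; enumerating n-gram spans is a simplifying assumption about how entries are matched.

```python
def retrieve_pairs(sentence, dictionary, max_phrase_len=3):
    # Collect word- and phrase-level paraphrase pairs that apply to the sentence.
    tokens = sentence.lower().split()
    pairs = []
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + 1 + max_phrase_len, len(tokens) + 1)):
            phrase = " ".join(tokens[i:j])
            for paraphrase in dictionary.get(phrase, []):
                pairs.append((phrase, paraphrase))  # candidates guiding edit decisions
    return pairs

ppdb_toy = {"big": ["large", "huge"], "a lot of": ["many"]}
print(retrieve_pairs("There are a lot of big houses", ppdb_toy))
```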
Text Morphing
In this paper, we introduce a novel natural language generation task, termed
text morphing, which aims to generate intermediate sentences that are fluent
and form a smooth transition between two input sentences. We propose Morphing
Networks, consisting of editing vector generation networks and sentence
editing networks, which are trained jointly. Specifically, the editing vectors
are generated with a recurrent neural network model from the lexical gap
between the source sentence and the target sentence. The sentence editing
networks then iteratively generate new sentences from the current editing
vector and the sentence generated in the previous step. We conduct experiments
with 10 million text morphing sequences extracted from the Yelp review
dataset. Experiment results show that the proposed method outperforms
baselines on the text morphing task. We also discuss directions and
opportunities for future research on text morphing.
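A hedged sketch of the iterative morphing loop: the `edit_vector` and `edit` callables are placeholders for the two jointly trained networks, and stopping once the lexical gap is closed is an assumption for illustration.

```python
def morph(source, target, edit_vector, edit, max_steps=10):
    # Iteratively rewrite the previous sentence toward the target.
    current, path = source, [source]
    for _ in range(max_steps):
        gap = set(target.split()) - set(current.split())  # remaining lexical gap
        if not gap:
            break
        current = edit(current, edit_vector(current, target))
        path.append(current)
    return path  # source, intermediate sentences, (ideally) the target
```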
Faithful to the Original: Fact Aware Neural Abstractive Summarization
Unlike extractive summarization, abstractive summarization has to fuse
different parts of the source text, which makes it prone to creating fake
facts. Our preliminary study reveals that nearly 30% of the outputs from a
state-of-the-art neural summarization system suffer from this problem. While
previous abstractive summarization approaches usually focus on improving
informativeness, we argue that faithfulness is also a vital prerequisite for a
practical abstractive summarization system. To avoid generating fake facts in
a summary, we leverage open information extraction and dependency parsing
technologies to extract actual fact descriptions from the source text. A
dual-attention sequence-to-sequence framework is then proposed to condition
the generation on both the source text and the extracted fact descriptions.
Experiments on the Gigaword benchmark dataset demonstrate that our model can
reduce fake summaries by 80%. Notably, the fact descriptions also bring
significant improvement in informativeness, since they often condense the
meaning of the source text.
Comment: 8 pages, 3 figures, AAAI 2018
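For illustration, a minimal PyTorch sketch of a dual-attention step that attends over the source text and the fact descriptions separately and then fuses the two context vectors; the gated fusion is an assumption, not necessarily the paper's exact combination.

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    def __init__(self, hidden):
        super().__init__()
        self.attn_text = nn.MultiheadAttention(hidden, 1, batch_first=True)
        self.attn_fact = nn.MultiheadAttention(hidden, 1, batch_first=True)
        self.gate = nn.Linear(2 * hidden, hidden)

    def forward(self, dec_state, text_states, fact_states):
        # dec_state: (batch, 1, hidden); *_states: (batch, len, hidden)
        c_text, _ = self.attn_text(dec_state, text_states, text_states)
        c_fact, _ = self.attn_fact(dec_state, fact_states, fact_states)
        g = torch.sigmoid(self.gate(torch.cat([c_text, c_fact], dim=-1)))
        return g * c_text + (1 - g) * c_fact  # fused context for generation
```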
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts
We introduce the task of automatic live commenting. Live commenting, also
called `video barrage', is an emerging feature on online video sites that
allows real-time comments from viewers to fly across the screen like bullets
or roll at the right side of the screen. Live comments are a mixture of
opinions about the video and chit-chat with other commenters. Automatic live
commenting requires AI agents to comprehend the videos and interact with the
human viewers who also make comments, so it is a good testbed for an AI
agent's ability to deal with both dynamic vision and language. In this work,
we construct a large-scale live comment dataset with 2,361 videos and 895,929
live comments. We then introduce two neural models that generate live comments
based on the visual and textual contexts, and that achieve better performance
than previous neural baselines such as the sequence-to-sequence model.
Finally, we provide a retrieval-based evaluation protocol for automatic live
commenting, in which the model is asked to sort a set of candidate comments by
log-likelihood score and is evaluated on metrics such as mean reciprocal rank.
Putting it all together, we demonstrate the first `LiveBot'.
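A minimal sketch of that evaluation protocol, assuming a `log_likelihood` scoring function supplied by the model under test:

```python
def mean_reciprocal_rank(batches, log_likelihood):
    # batches: list of (candidates, gold_index); log_likelihood scores one comment.
    total = 0.0
    for candidates, gold in batches:
        ranked = sorted(range(len(candidates)),
                        key=lambda i: log_likelihood(candidates[i]), reverse=True)
        total += 1.0 / (ranked.index(gold) + 1)  # reciprocal rank of the gold comment
    return total / len(batches)
```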
AttSum: Joint Learning of Focusing and Summarization with Neural Attention
Query relevance ranking and sentence saliency ranking are the two main tasks
in extractive query-focused summarization. Previous supervised summarization
systems often perform the two tasks in isolation. However, since reference
summaries are a trade-off between relevance and saliency, using them as
supervision means neither of the two rankers can be trained well. This paper
proposes a novel summarization system called AttSum, which tackles the two
tasks jointly. It automatically learns distributed representations for
sentences as well as for the document cluster. Meanwhile, it applies an
attention mechanism to simulate human attentive reading behavior when a query
is given. Extensive experiments are conducted on the DUC query-focused
summarization benchmark datasets. Without using any hand-crafted features,
AttSum achieves competitive performance. We also observe that the sentences
recognized as focusing on the query indeed meet the query's needs.
Comment: 10 pages, 1 figure
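As an illustration, a minimal PyTorch sketch of query-aware attention pooling in this spirit, with dot-product scoring as a simplifying assumption:

```python
import torch

def attentive_cluster_rep(sent_vecs, query_vec):
    # sent_vecs: (n_sents, d) sentence embeddings; query_vec: (d,) query embedding.
    weights = torch.softmax(sent_vecs @ query_vec, dim=0)  # attention over sentences
    return weights @ sent_vecs  # weighted pooling -> document-cluster representation
```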